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QUANTILE CALCULUS AND CENSORED REGRESSION 

By Yijian Huang 1 

Emory University 

Quantile regression has been advocated in survival analysis to 
assess evolving covariate effects. However, challenges arise when the 
censoring time is not always observed and may be covariate-dependent, 
particularly in the presence of continuously-distributed covariates. In 
spite of several recent advances, existing methods either involve algo- 
rithmic complications or impose a probability grid. The former leads 
to difficulties in the implementation and asymptotics, whereas the lat- 
ter introduces undesirable grid dependence. To resolve these issues, 
we develop fundamental and general quantile calculus on cumulative 
probability scale in this article, upon recognizing that probability 
and time scales do not always have a one-to-one mapping given a 
survival distribution. These results give rise to a novel estimation pro- 
cedure for censored quantile regression, based on estimating integral 
equations. A numerically reliable and efficient Progressive Localized 
Minimization (PLMIN) algorithm is proposed for the computation. 
This procedure reduces exactly to the Kaplan-Meier method in the 
fc-sample problem, and to standard uncensored quantile regression in 
the absence of censoring. Under regularity conditions, the proposed 
quantile coefficient estimator is uniformly consistent and converges 
weakly to a Gaussian process. Simulations show good statistical and 
algorithmic performance. The proposal is illustrated in the applica- 
tion to a clinical study. 

1. Introduction. Quantile regression [Koenker and Bassett (1978)], con- 
cerning models for conditional quantile functions, has developed into a pri- 
mary statistical methodology to investigate functional relationship between 
a response and covariates. Targeting the full spectrum of quantiles, it pro- 
vides a far more complete statistical analysis than, say, classical linear re- 
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gression. This technique has a long history in econometric applications. More 
recently, quantile regression has also been advocated for survival analysis to 
address evolving covariate effects which is a common phenomenon in demo- 
graphic and clinical research among others. For instance, the aging process 
as well as the effects of its determinants can be vastly different at various 
stages of life [cf. Koenker and Geling (2001)]. On the other hand, a clin- 
ical intervention can rarely be expected to have a constant effect, due to 
the time lag in reaching full effect and to drug resistance. The quantile re- 
gression model allows for varying regression coefficients and thus suits these 
applications well. However, a main challenge arises from censoring. 

Denote the survival time by T and the censoring time by C. As a result of 
censoring, T is not directly observed but through follow-up time X = T A C 
and censoring indicator A = I(T < C), where A is the minimization operator 
and /(•) is the indicator function. Of interest is the relationship between T 
and a p x 1 covariate vector Z with constant 1 as the leading component. 
Quantile function is an inverse of the distribution function Fz(t) = Pr(T < 
t\Z). However, ambiguities arise in the presence of zero-density intervals; for 
example, zero mortality is not uncommon at the beginning of many clinical 
trials since new enrollees are relatively healthy. To be definitive, we adopt 
the cadlag inverse, that is, the inverse function that is right-continuous with 
left-hand limits. The rth conditional quantile of T given Z is defined as 

(1) Qz(T)=sup{t:F z (t)<r}, r€[0,l). 
The quantile regression model postulates that 

(2) Q z (t) = Z t P (t) VtG[0,1), 

where (3q(t), referred to as the quantile coefficient, is a function of proba- 
bility t. This model is semiparametric in general, but nonparametric in the 
/c-sample problem. The interest in evolving covariate effects necessitates the 
functional modeling of (2), which distinguishes itself from the modeling on a 
specific quantile as in, for example, median regression; see Section 6 for fur- 
ther discussion. Note that the time scale may be, say, logarithm-transformed, 
and accordingly the supports of T, C and X are not necessarily restricted to 
the nonnegative half line. When all components of Po(t) except the intercept 
are constant in r, this model reduces to the accelerated failure time mode 
as studied by Buckley and James (1979) and Tsiatis (1990) among others. 
In this regard, the quantile regression model is a varying-coefficient gener- 
alization. To provide an interpretation, suppose for the moment that model 
(2) holds on the logarithmic time scale. Then, the effect of a covariate, other 
than the leading 1 of Z, is to stretch or compress the baseline survival time 
(on the original scale) with a quantile-dependent stretching or compressing 
factor. 
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With uncensored data, Koenker and Bassett (1978) generalized sample 
quantile and proposed regression quantile as a quantile coefficient estimator 
via a convex objective function. An adaptation of the well-known Barrodale- 
Roberts algorithm was later suggested by Koenker and D'Orey (1987) for the 
computation. The reliability and efficiency of this algorithm contribute to 
broader acceptance of the standard methodology. In the presence of censor- 
ing, Powell (1984, 1986) proposed an estimation procedure when censoring 
time C is always observed. His approach applies uncensored quantile regres- 
sion to X as the rth quantile of X turns out to be Qz(t) A C = Z T (3q(t) A C . 

However, in most survival studies, not only is the survival time subject to 
censoring but also the censoring time is unobserved for uncensored individ- 
uals. Taking the missing-data perspective of censoring, Ying, Jung and Wei 
(1995) and Honore, Khan and Powell (2002) developed different methods 
but with the common consistent estimation requirement of the censoring 
distribution given covariates. This amounts to either an unconditional inde- 
pendence censoring mechanism, or a finite-number limitation on covariate 
values, or additional censoring-time modeling to achieve a root-ra conver- 
gence rate of the estimated censoring distribution. Obviously, none of these 
is desirable. Ying, Jung and Wei (1995) indicated that such a restriction 
may be relieved by employing smoothing techniques to nonparametrically 
estimate the conditional censoring distribution. As an alternative, Wang 
and Wang (2009) developed a method by nonparametrically estimating the 
conditional survival distribution via kernel smoothing. Nevertheless, Robins 
and Ritov (1997) argued that these smoothing-based methods may not be 
practical with moderate sample size in the presence of multiple continuously- 
distributed covariates; see also Portnoy (2003). 

Our investigation focuses on the same preceding data structure, but aims 
to allow for generalities on both the censoring mechanism and covariates. 
Specifically, we consider conditional independence censoring mechanism: 

(3) T±C\Z, 

where _L denotes statistical independence. This problem was first investi- 
gated by Portnoy (2003), who suggested the pivoting method employing the 
redistribution-to-the-right imputation scheme for censoring [Efron (1967)]. 
The mass of censored observations is recursively redistributed to adopt stan- 
dard uncensored quantile regression. However, one premise is the quantile 
monotonicity so that the "right," or future, is unequivocal in the redistri- 
bution. In the /c-sample case, the monotonicity holds in uncensored sam- 
ple quantile, and the method reduces to the Kaplan-Meier method, that is, 
taking an inverse of the Kaplan-Meier estimator. Unfortunately, uncensored 
quantile regression in general does not respect the monotonicity, leading to 
both algorithmic and analytic difficulties with Portnoy (2003). Indeed, the 
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asymptotic properties of the estimator have not yet been established; see 
Neocleous, Vanden Branden and Portnoy (2006). As an alternative, Neo- 
cleous, Vanden Branden and Portnoy (2006) advocated a closely related 
grid method. Most recently, Peng and Huang (2008) proposed a functional 
estimating function upon discovering a martingale structure, and developed 
a grid-based quantile coefficient estimator. Both uniform consistency and 
weak convergence have been established. As for the last two methods, the 
grid dependence, however, might not be completely satisfactory. 

This article makes two main contributions to this problem. First of all, 
fundamental and general quantile calculus is developed on probability scale, 
establishing the probability-scale dynamics with allowance for zero-density 
intervals and discontinuities in a distribution. Second, from quantile calculus 
a well-defined estimator and a reliable and efficient algorithm for censored 
quantile regression naturally emerge on the basis of estimating integral equa- 
tions. As compared with Portnoy (2003), Neocleous, Vanden Branden and 
Portnoy (2006) and Peng and Huang (2008), this new approach entails nei- 
ther algorithmic complications nor a probability grid. For the rest of this 
article, quantile calculus is presented in Section 2, and the proposed estima- 
tor and algorithm in Section 3. The asymptotic properties are investigated 
and an inference procedure suggested in Section 4. Section 5 presents simu- 
lation results on statistical and algorithmic performance, and an illustration 
with a clinical study. Section 6 concludes with discussion. The proofs are 
collected in Appendices A-E. 



2. Quantile calculus. Given a survival distribution, a one-to-many map- 
ping from probability to time scale may arise from zero-density intervals; 
adopting the cadlag definition of quantile function is a solution given in the 
Introduction. Reciprocally, a one-to-many mapping from time to probabil- 
ity scale may also arise, resulting from distributional discontinuities. Thus, 
time-scale theories including counting-process martingales cannot be applied 
to probability scale, unless continuity restriction is imposed on the distri- 
bution. In uncensored quantile regression, this issue may be bypassed by 
formulating the estimation as an optimization problem. However, such an 
approach may not be feasible in censored quantile regression, which calls for 
the development of quantile calculus. 



2.1. The one-sample case. Drop Z from the notation in this case. By 
definition, Pr{T < Q(r)} < r < Pr{T < Q(t)}. Thus, Q(r) does not cor- 
respond to a unique probability r when Pr{T = Q(t)} > 0. To fill in the 
missing piece, we introduce the rth quantile equality fraction: 

t-Pt{T<Q(t)} 



Pt{T = Q(t)} 
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which is the fraction of the probability mass at the quantile that brings the 
cumulative probability up to r. Here and throughout, we define 0/0 = 0. 
Elementary algebra then gives 

Pr{T<Q(r)} + Pr{T = Q(r)K(T) 

(4) 

= /Vr{T > Q(v)} - Pr{T = Q{ v )}^ v )]-±- Vr € [0, 1). 
Jo l ~v 

This result establishes the quantile dynamics on probability scale. More 
significantly, it can be readily exploited to accommodate censoring. Denote 
the limit of identifiability by r = sup{r : Pr{C > Q{t)} > 0}. 



Proposition 1. Suppose that T and C are independent and their dis- 
tributions do not have jump points in common. Consider integral equation 

E(A[I{X< q (T)}+I{X = q(T)} V (T)}) 

(5) 

= E I [I{ X >q(u)}-I{X = q(u)}r)(u)}-^- VrG [0,1), 
Jo i -v 

where q(-) is a cadlag function and rj(-) takes values in [0,1]. 

(i) // g (.) = Q(.) and t](t) = C(r) for all r such that Pr{T = Q{t)} > 0, 
then (5) holds. 

(ii) // (5) holds, then 

(6) q(r) = Q(r), 

(7) E[AI{X = q(r)} V (r)] = E[AI{X = Q(r)K(r)] 
for all t € (0,r). 



Remark 1. The condition that the distributions of T and C do not 
share jump points is practically needed for the identifiability of the former 
and therefore the corresponding quantile function as well. The role of t](t) is 
to split probability mass in the case of Pr{T = Q(r)} > 0. Equation (5), how- 
ever, does not determine tj(t) elsewhere. But instead of, say, setting t](t) to 
in those occasions, keeping the more general form would be advantageous 
for later developments. 



2.2. Quantile coefficient dynamics. Similar to the one-sample case, we 
assume the following assumption. 



Assumption 1 . The conditional distribution functions of T and C given 
Z do not have jump points in common for all values of Z. 
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Write £,z( T ) as the rth quantile equality fraction for the distribution of T 
given Z. Generalize the definition of identifiability limit as 

T=sup{T:E[Z® 2 I{C>Z T f3 (T)}} is nonsingular}, 

where v® 2 = vv T . The one-sample result of Proposition 1 can then be ex- 
tended. 

Proposition 2. Suppose that quantile regression model (2) and censor- 
ing mechanism (3) hold along with Assumption 1. Consider integral equation 

E(ZA[I{X < Z t P(t)} + I{X = Z T P(t)} Vz (t)]) 

(8) = E [ T Z[I{X>Z T p(v)}-I{X = Z T p(v)}r iZ (v)]-^- 

Jo i — v 

We [0,1), 

where /?(•) is a cadlag function and Z -dependent n z {-) takes values in [0, 1]. 

(i) // /?(•) = /3o(") ind, for any given Z , r\ z {r) = €z(t) for all r such 
that Pr{T = Q z (t)\Z} > 0, then (8) holds. 

(ii) In the case that both C and Z are discretely distributed, if (8) holds, 
then 

(9) /3(r)=/3 (r), 

(10) E[ZAI{X = Z J ' P(t)}tjz(t)] = E[ZAI{X = Z T /3 (r)}^(r)] 
for all t € (0,t). 

Remark 2. The admission of Pq(-) as a solution to integral equation 
(8) is general in the sense that no restriction on the survival and censoring 
distributions is imposed other than Assumption 1. But the uniqueness result 
of /3o(-) is provided only for the case that C and Z are discretely distributed. 
It is also established under some other conditions in Section 4. However, we 
do not yet have a proof for the most general case. 

Remark 3. Consider the censoring- absent special case with nonsingular 
E(Z® 2 ). Then, integral equation (8) reduces to 



D(t) = J\eZ-D{u)}- 



dv 



where D(t) is the left-hand side of (8). This equation has a unique and 
closed- form solution D(t) = tEZ, or 

(11) E(Z[I{T < Z t I3(t)} + I{T = Z t (3(t)}i ]z (t)]) = tEZ. 
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Note that ?]z( T ) affects the left-hand side at a nonsmooth point only, that 
is, when Pr{T = Z T (3(t)} ^ 0. With fixed rjz{ T ), the left-hand side may 
not be smooth in (3(t). Nonetheless, thanks to rjz(r), it can always be 
smooth in r and therefore, the equality of (11) is attainable. As far as /3(t) 
is concerned, the equation is equivalent to the minimization problem with 
E[{T - Z t /3(t)}- + t{T - Z t {3(t)}], where a" = -al(a < 0). Thus, Propo- 
sition 2 reduces to a well-known result in uncensored quantile regression. 

Remark 4. Quantile equality fraction £z(t) is a nuisance parameter. 
When Pr(Z = z) = for a given value z, Pr{T = Qz(t)\Z = z} and thus 
£z(t) are not identifiable. Nevertheless, only quantity E[ZAI{X = 
Z T Pq(t)}^z{ t )] as a whole is relevant to integral equation (8) and it is 
identifiable. As evident from Remark 3, the notion of £z( r ) might not be 
necessary for uncensored quantile regression, by employing minimization. 
But it is instrumental for our development of censored quantile regression. 

Remark 5. Proposition 2 is more general than the martingale result 
of Peng and Huang [(2008), equation (4)], whose validity is limited to the 
circumstance of continuous survival distribution. In that special case, the 
former may reduce to the latter since the mapping between time and prob- 
ability scales becomes one to one and all terms involving rjz(-) in integral 
equation (8) may vanish. Even so, the more general form of (8) is still desir- 
able in order to derive a natural estimating integral equation by the plug-in 
principle. After all, an empirical distribution is always discrete, that is, full 
of discontinuities and zero-density intervals. 

2.3. Relative quantile. To facilitate probability-scale analysis both con- 
ceptually and algebraically, we introduce the notion of relative quantile. An- 
chored at the rth quantile, the {r + A(l — T~)}th quantile coefficient for 
A € [0, 1) is referred to as the Ath relative quantile coefficient, written as 
Po(X,t) = (3q{t + A(l — t)}. This notion provides a convenient vehicle to 
study quantile coefficient /3o(") forward from a given probability, similar in 
spirit to the concept of hazard in survival analysis. 

An integral equation for /3o(A,r) can be derived with given D(t), the 
left-hand side of (8) at r. By algebraic manipulation, integral equation (8) 
implies 



where /3(A,r) = /3{r + A(l — r)} and r]z{\, r) = r]z{r + A(l — r)}. Apparently, 
this is a dual equation since equation (8) becomes a special case when r = 0. 



E(ZA[I{X < Z 1 13(X, t)} + I{X = Z 1 /3(A, t)}t? z (A, r)]) - D{r) 



(12) 




VA € [0, 1) 
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3. Proposed estimator and algorithm. 



dv 



3.1. Estimating integral equation. The data consist of (JQ, Aj,Zj), i = 1, 
. . . , ra, as n i.i.d. replicates of (X,A,Z). Proposition 2 leads naturally to our 
proposed estimation procedure based on the empirical version of integral 
equation (8): 

n 

^ZiAipiXi < Zj^+IiXi = Zj P{t)} Wi {t)\ 
i=i 

= [ ZiiHXi > ZjP(v)} - I{X, = Zj P(p)} Wi {p)]- 
i=i J o 1 

where Wi(-) takes values in [0,1]. Representing a convenient reparameter- 
ization of t]z(t) in (8), fraction Wi(r) serves the purpose of splitting the 
empirical probability mass associated with individual i when and only when 
Xi = Zj (3(t). For an uncensored individual, this ensures the continuity of 
&(r) = I{X, < Zjp{r)} + I{X, = Zj(3{T)} Wi {r). 

We shall say that b interpolates an observation (X, A, Z) if X = Z T b. 

Theorem 1 . Suppose that Y27=i ^f 2 is nonsingular. Estimating inte- 
gral equation (13) admits a solution /?(•) over r € [0,1) with the following 
properties: (i) /?(■) is cadlag; and (ii) /3(r) interpolates at least p individuals 
and the covariate matrix for the interpolated set is of full rank, for each and 
every r G [0, 1). 

Remark 6. Estimating integral equation (13) also admits a solution in 
the case of singular Y^i=i Zf 2 , by Theorem 1 upon eliminating parametriza- 
tion redundancy of the quantile coefficient. 



Remark 7. A subtle issue concerns the fact that identifiability limit r 
is unknown. Empirically, r cannot even be determined definitively to exceed 
any r > 0, which may be easily seen in the one-sample case. Although it is 
possible to estimate r, we do not terminate the estimation of (3q{-) at such 
an estimate but rather provide /3(r) for all r € [0,1); the properties of /?(•) 
would otherwise become more complicated. Precisely speaking, /3(r) is an 
estimator of /3q(t) provided r < r . This strategy of separating the estima- 
tion of /3o( r ) provided r < r from that of r is similar to that adopted by 
Peng and Huang (2008). In contrast, Portnoy (2003) and Neocleous, Vanden 
Branden and Portnoy (2006) terminated their estimation of Po(-) once the 
estimate becomes nonunique, which might partly explain the difficulties in 
their interval estimation. 
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Geometrically, f3(r) for each r is a hyperplane, partitioning the sample 
into 

{i : Xi < Zj (3(t)}, below set, 

* {i:Xi = Zj (3(t)}, interpolated set, 

k {i :Xi > Zj (3(t)}, above set. 

Each interpolated individual on the hyperplane may be split in a ratio of 
Wi(r) : {1 — Wi(r)} to be associated with the below and above sets, respec- 
tively. This gives rise to a sample bipartition indexed by r, and estimating 
integral equation (13) governs its evolution. 

3.2. Structuring the computation. Estimating integral equation (13) may 
be solved exactly with the proposed Progressive Localized Minimization 
(PLMIN) algorithm. The algorithm proceeds from the Oth quantile coeffi- 
cient upward in a progressive fashion. Due to sample discreteness, /?(•) is 
piecewise constant. We thus conveniently decompose the computation into 
sequential rounds, each involving that of a Oth relative quantile coefficient 
and a potential breakpoint. 

Suppose that (13) is solved up to r— , and thus 4>i(r— ) of every uncensored 
individual is available. Then, by continuity 0j(r) = (f>i(r—) of uncensored 
individual i is determined; obviously </>j(r) = in the case of r = 0. Inherited 
from the relationship between integral equations (8) and (12), estimating 
integral equation (13) is equivalent to the following equation for relative 
quantile coefficient: 
n 

Y,ZiAi[I{Xi < Zjp(\,T)} + I{X t = Zj P(\,r)}wi(\,T) - &(r)] 
i=i 

n rX 

(14) =^ / Zi[I{Xi>ZJl3{u,T)} 

i=i Jo 

- I{X t = Zj (3(v,T)} Wi (v,T)}-^, 

1 — V 

where it!j(A,r) = Wi{r + A(l — r)}. Since /3(A,r) remains constant from 
A = up to a potential relative breakpoint, say, H = Y17=i Zi[I{Xi > 
Zj /3(X,r)} — I{Xi = Zj t)}w{(t)] is locally constant, that is, for A € 
[0, A&). In the case that a censored individual becomes interpolated, adopt 
the convention that its iOj(A,r) remains constant locally. Write £(A) as the 
left-hand side of (14). Then, estimating integral equation (14) is locally 
equivalent to 

L(X)= / {H-L(u)}~ , A G [0, A;,), 

Jo i — v 
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which admits a unique solution L(X) = XH or equivalently, 

n 

^Z i A i [I{X i <zJ^(X,T)} + I{X i = Zj/3(X,T)}w i (X,T)-cl )i (T)] 



(15) ^ 



Aj> /{.V, > Zj(3(X,r)} - I{X t = Zjp(X,T)}wi(T)}. 



i=l 



Write /3(A,r) = /3{r + A(l — r)}. Since is cadlag, /3(0,r) is the solution 
to the above equation with A J, 0. Subsequently, is a A, typically the supre- 
mum A, such that the equation holds with /3(A,r) = (3(0, t). Furthermore, 
Wi(Xb—,T) of every interpolated uncensored individual will be determined. 
Thus, solving equation (13) moves forward to r + A&(1 — r). The PLMIN 
algorithm is so named since the computation will be conveniently carried 
out via minimization. 

3.3. Computing 0th relative quantile coefficient and potential breakpoint. 
With the same arguments following (11), solving (15) for /3(0,r) can be 
reformulated as a minimization problem: 

n 

/3(0,r) = limargmin V(Xi - Zjb) 

A4-0 . 

1=1 

x [I{X % > Zjb) - X- 1 A l {I(X l < Zjb) - & (r)}], 
which no longer involves Wi(-,r). Further algebraic simplification gives 



(16) ,3(0, r) = argmin^pQ - Zjb\ 

subject to 



h i=i 



X t < Zjb V» € D_ = {j : A j = 1, 4> 3 {t) = 1}, 
Xi = Zjb V» € Do = {J : Aj = 1, <^(r) € (0, 1)}, 
Xi > Vt € D+ = {j : Aj =l,(f>j (r) = 0}, 

where a + = a/(a > 0). For the special case of the 0th quantile coefficient, 



(17) 0(0) = argmin - Zjb)~ 

subject to 



6 i=i 



X;>Zjb Vi:A,- = l. 
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The minimization of (16) is a piecewise- linear programming problem with 
convex objective function, characterized by the following lemma to Theorem 
1. Note that, once 0(0, r) is determined, so is Wi(r) of a 0(0, r)-interpolated 
uncensored individual by continuity of 4>i(r). 

Lemma 1. Suppose that the condition of Theorem 1 holds and that 
covariates Zi, i 6 Do, are linearly independent. There exists a minimizer 
0(0, t) for problem (16) such that the covariate matrix for 0(0, t) -interpolated 
observations is of full rank. Furthermore, there exist (i) a p -member subset 
§ of 0(0, t) -interpolated observations with Do CS; and (ii) for any 0(0, t)- 
interpolated censored individual i, 



uii(r) € 



[0,1], ifieS, 
{0, 1}, otherwise, 



such that (hi) X = {Zi :i G §} is of full rank; and (iv) H = J27=i ^iVi^i — 
Zj 0(0, t)} — I{Xi = Zj 0(0, r)}wi(r)] as determined satisfies 

(is) y. z ^=h 

for some ji , where ji < for i € D_ and 7, > for i € D + . 

Piecewise-linear programming can be viewed as extended linear program- 
ming, although a 0(0, r)-interpolated individual may be a censored one and 
thus not involved in the constraints. We devise an algorithm aiming at the 
determination of the p-member interpolated subset S, the same strategy as 
the simplex method of linear programming [e.g., Gill, Murray and Wright 
(1991)]. To locate a candidate member of S, the method of steepest descent 
is used. Note that a feasible value for 0(0, r) is readily available. In the case 
of t = 0, any value with a sufficiently small intercept component is feasi- 
ble. Subsequently, 0(t—) is feasible as necessary by continuity of 4>i(-) for 
uncensored individuals. The minimization along a given feasible direction is 
reached once an uncensored observation becomes interpolated, or potentially 
so if the interpolated observation is a censored one instead. The constrained 
space is of dimension p minus the size of Bo- F° r 0(0), there is no equality 
constraint and the dimension is p. For following 0th relative quantile coeffi- 
cients, typically the dimension is 1 in which case the minimization involves 
only a line search. To deal with the possibility of more than p interpolated 
individuals, the perturbation anti-cycling technique in linear programming 
[e.g., Gill, Murray and Wright (1991), Section 8.3.3] can be adapted. In the 
perturbation, one may follow a tie-breaking convention to let individuals in 
D + precede censored ones, which in turn precede those in B_. This mini- 
mization is numerically reliable and efficient. 
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The minimization determines /3(0,r), S, Wi(r) for each in S, and 7$ for 
each uncensored in S. Plugging them into (15) yields 

(19) Yl ZiMM^ r)-Wi(T)} = \J2 ZiA ai . 

Simple algebra then gives the potential relative breakpoint 

m) . J min /(7»>0)-u>,(r) £ g; = 

(2UJ A b - < ieS . A . =li7 .^ 7i 

I 1, otherwise, 

which is proper in the sense of < A5 < 1. The lower bound of A;, is obvious, 
whereas the upper bound can easily be established by analyzing the inter- 
cept component of (18) and (19). If At = 1, the final quantile is reached. 
Otherwise, for those uncensored, 

Wi(Xb-,r) =Wi(r) + X b ji, ieS:A* = l. 

At least one Wi(Xb—,r) above reaches or 1, so is the corresponding 0j{r + 
A{,(1 — r)}. Note that Aj, is a breakpoint if /3(t) interpolates exactly p in- 
dividuals; but not necessarily so otherwise. Nevertheless, of importance in 
both cases is that the solution moves forward in a sensible fashion. 

When r is small, S typically consists of uncensored individuals only. But as 
r becomes larger, interpolated censored individuals could emerge when /?(r) 
might still be uniquely determined nonetheless. Eventually, the computation 
could reach a point beyond which (3(t) is no longer unique. Apparently, 
this phenomenon relates to the identifiability issue; see Remark 7. On a 
different note, just like uncensored quantile regression, this censored quantile 
regression may not respect quantile monotonicity in general. 

3.4. Relationships with standard methods in special cases. In the absence 
of censoring, estimating integral equation (13) reduces to 

n n 

(21) ^2Zi[I{Ti < Zjpirft + IiTi = Zjp{T)}w i {T)]=TY J Zi 

i=l i=l 

by the same approach to obtaining (11) from (8). Thus, /?(•) is the cadlag 
function /?(■) that minimizes X^ili^i — Zj f3{r)}~ + r{Tj — Zj (3(t)}], re- 
ducing to one regression quantile of Koenker and Bassett (1978); note that 
the Koenker-Bassett estimator is not always uniquely defined. In the mean 
time, 1 — </>i(r) becomes I{Ti > Zj (3(t)} — I{Ti = Z j r /3(r)}u'i(r), which is 
the regression rank score of Gutenbrunner and Jureckova (1992). 

On the other hand, in the one-sample case, /?(-) reduces exactly to the 
cadlag inverse of the Kaplan-Meier estimator. It is clear from (17) that (5(0) 
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is the first failure time and from (20) that the breakpoint is the Nelson- 
Aalen estimate of the hazard at /3(0). Subsequently and more generally, each 
estimated 0th relative quantile is a failure time and the relative breakpoint 
is the Nelson-Aalen hazard estimate. In case that the last observation is 
censored, the final estimated quantile is defined as this last Jbllow-up time 
by convention. More generally, in the /c-sample problem, /?(•) is a linear 
combination of cadlag inverses of the k Kaplan-Meier estimators. 

4. Asymptotic study and inference. In our developments thus far, we 
have kept our assumptions to minimal. But the generality challenges large- 
sample developments in both exposition and technicalities; see Section 6 
for further discussion. In this section, we shall focus on the situation that 
Fz is continuous and free of zero-density intervals, and additionally C is 
continuously distributed. These regularity conditions were also adopted in 
previous investigations [Portnoy (2003), Neocleous, Vanden Branden and 
Portnoy (2006), Peng and Huang (2008)]. Nevertheless, Portnoy (2003) and 
Neocleous, Vanden Branden and Portnoy (2006) required the absence of 
censoring prior to and around the 0th quantile. On the other hand, Peng 
and Huang (2008) presumed that the 0th quantile is — oo. In contrast, we 
do not impose any conditions on the 0th quantile. 

A parameter space needs to be specified. In light of the interpolation 
property of the estimator by Theorem 1, we require that any b in such a 
parameter space satisfies that E[Z® 2 I{Z T /3 (0) < Z T b < Z T /3 (1-) A C}] is 
nonsingular. Write eigmin as the minimum eigenvalue of a matrix. Specifi- 
cally, a parameter space containing 0o(t) for all r € [0, it] is given by 

B( u ) = {b£Rx C p _i : eigmin E[Z® 2 I{Z T Po(0) < Z T b 

<Z T f3 (l-)AC}]>c(u)}, 

where constant u < r, compact space C p _i C and positive constant 

c(u) < eigmin E{Z® 2 I{C > Z T /3q(u)}]. Thus, all slope components are bounded 
but the intercept may be — oo. 

Write |j • || as the Euclidean norm. Let Fz{t) = 1 — Fz(t) and Gz{t) = 
1 — Gz(t) = Pr(C > t\Z). Adopt the following regularity conditions: 

CI. t > and \\Z\\ is bounded; 

C2. Fz and Gz have density functions fz and gz, which both are continuous 

and bounded, uniformly in t and Z\ 
C3. /?o(') is Lipschitz-continuous on [ti,T2] for any t\ and t% such that 

0<ri <t 2 <1; 

C4. there exist u G (0,r) and a parameter space M(u) such that the maxi- 
mum singular value of 

*(6) = E{Z^Fz(Z T b)gz(Z T b)}[E{Z^ 2 Gz(Z T b)fz(Z T b)}r 1 

is bounded uniformly in 6 € M(u)\dM(u), where d denotes the boundary. 
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The first two conditions are self-explanatory. Conditions C3 implies that the 
survival distribution does not have zero-density intervals between Qz (0) and 
Qz(l—)- Imposing constraints on censoring, condition C4 is a sufficient and 
technical one to accommodate the possibility of unbounded /?o(0). 

Throughout this section, /?(•) is any cadlag solution to estimating integral 
equation (13). The solution may not be unique, nor is the interpolation 
property in Theorem 1 necessary. 

Theorem 2. Suppose that quantile regression model (2) and censor- 
ing mechanism (3) hold along with conditions C1-C4. Equation (8) implies 
P(t) = A)(t) for all t £ (0, u] . For any I G (0, u) , sup Te[u] ||/3(r) - /3 (r) || -> 

almost surely. Furthermore, n 1//2 {/3(r) — /3o( r )} converges weakly to a Gaus- 
sian process on [l,u]. 

Remark 8. Integral equation (8) is an initial value problem, and es- 
timating integral equation (13) is its empirical version. Accordingly, the 
large-sample study as provided in Appendix D exploits classical differential 
equation theory and modern empirical process theory. Our study bears sim- 
ilarities with that of Peng and Huang (2008). Indeed, under the continuity 
condition of C2, (13) is essentially equivalent to the estimating function of 
Peng and Huang [(2008), equation (5)] since Wi(-) becomes negligible; see 
also Remark 5. Nevertheless, we spare the inductive arguments of Peng and 
Huang (2008) in their asymptotic study as typically necessary for a grid 
method, by virtue of the fact that /?(•) is an exact solution to (13). Equally 
noteworthy is that the generality here on the 0th quantile requires a more 
delicate treatment. 

Remark 9. In the case that Z T (3q(0) is — oo for all Z, our estimator 
P(-) is asymptotically equivalent to that of Peng and Huang (2008) provided 
that mesh size of the grid as required by the latter is of order o(n -1 / 2 ). 

To make inference, the distribution of n 1 / 2 {/3(-) — /?o( - )} needs to be esti- 
mated. For their estimator, Peng and Huang (2008) adapted the resampling 
approach of Jin, Ying and Wei (2001). We adopt the same approach by per- 
turbing estimating integral equation (13). This procedure is equivalent to a 
multiplier bootstrap as described in Kosorok [(2008), Section 2.2.3]. 

Theorem 3. Suppose that the conditions of Theorem 2 hold, and that 
nonnegative random variable £ has unit mean and unit variance and satis- 
fies J °° Pr(£ > x) 1 / 2 dx < oo. Perturb estimating integral equation (13) by 
assigning i.i.d. random variables of the same distribution as £ and inde- 
pendent of the data to individuals in the sample, and denote a solution to 
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the perturbed equation by /?*(•). On [l,u], n 1 / 2 {/3(-) — A)(')} ^ as the same 
asymptotic distribution as n 1 / 2 {/3*(-) — /?(•)} conditionally on the data. 

The standard exponential distribution, for example, may be used to gener- 
ate these perturbing random variables. By repeatedly simulating perturbed 
samples, the conditional distribution of /?*(•) can be obtained as an approx- 
imation for the distribution of /?(•)■ 

5. Numerical studies. The quantile regression model is formulated in 
A)(-)- But alternative covariate-effect measures can be practically useful and 
were used in our application (Section 5.3). Write 

^o(ri,r 2 ) = (r 2 - n)" 1 / f3 (u) dv. 

J Tl 

Model (2) implies 

(22) fo-Ti)" 1 f 2 Qz(is)dv = Z T v (T 1 ,T 2 ), 

J Tl 

where the left-hand side is a trimmed mean of T. Therefore, ^o(ti,t 2 ) mea- 
sures trimmed mean effect. This measure is versatile through the choices 
of ti and r 2 . In fact, /3o( r ) = nm z4r Mo( r ) v ) is a special case. On the other 
hand, /io(0, 1) is the mean effect, that is, the regression coefficient in the 
linear regression model. Originally suggested as an average effect measure 
by Peng and Huang (2008), /^(tIi 7 ^) becomes even more appealing in light 
of its specific interpretation as revealed. With censored data, /xo(ti,T2) is 
identifiable when r 2 < r, and a natural estimator is given by 

PT2 ^ 

M T i,T2) = (r 2 -ny 1 j f}(v)dv. 

Jti 

Obviously, /i(ri,r 2 ) with < t\ < r 2 < u is strongly consistent and asymp- 
totically normal under the conditions of Theorem 2. The variance can be 
estimated by using the simulated distribution of /?*(•)• Our numerical expe- 
rience suggested that /2(ri,r 2 ) behaves reasonably well even when t\ takes 
0. 

5.1. Finite-sample statistical performance. Simulations were conducted 
to mimic a clinical trial. On the original time scale, the baseline survival 
distribution was standard exponential and the censoring distribution was 
uniform on [0,5]. The quantile regression model held on the logarithmic 
time scale, with two nonconstant covariates: Z\ was Bernoulli of probability 
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Table 1 

Statistical assessment under models with two nonconstant covariates 



Proposed Portnoy, pivoting Portnoy, grid Peng-Huang 



T 


B 


SD 


SE 


CI 


B 


SD 


CI 


B 


SD 


CI 


B 


SD 


CI 










Scenario 1: 


with varying covariate effect 










0.1 


1 


521 


551 


93.6 


-1 


518 


90.5 


-51 


526 


90.6 


10 


518 


93.5 




3 


474 


518 


96.2 


3 


474 


95.2 


-7 


479 


94.7 


4 


475 


95.7 




7 


794 


866 


95.1 


8 


793 


95.2 


6 


805 


93.7 


8 


793 


95.1 


0.3 


1 


325 


337 


94.4 


3 


324 


92.4 


-17 


323 


91.8 


16 


323 


94.2 




3 


311 


325 


94.3 


3 


310 


93.9 


-3 


312 


93.5 


6 


310 


94.8 




-7 


540 


549 


94.9 


-9 


539 


93.3 


-8 


537 


93.5 


-9 


541 


93.6 


0.5 


-6 


254 


258 


93.4 


-1 


286 


89.8 


-19 


253 


90.3 


9 


253 


93.2 







232 


240 


94.9 


-3 


238 


92.8 


-1 


231 


94.2 


-1 


231 


95.1 




O 


A Oft 
4Uo 


A 1 A 


94.3 


-1 


451 


93.2 


4 


408 


QO 7 

yz. ( 


D 


/111 
411 


Q/1 9 


0.7 


-5 


235 


248 


95.9 


-16 


294 


90.5 


-20 


233 


90.6 


12 


240 


95.2 




9 


220 


239 


95.8 


-6 


400 


95.0 


6 


213 


95.2 


14 


223 


95.9 




5 


384 


405 


96.0 


53 


1096 


94.3 


2 


383 


94.1 


11 


391 


96.5 










Scenario 2: ; 


iccelerated failure time model 










0.1 


3 


506 


534 


94.2 


1 


502 


91.2 


-50 


511 


90.8 


14 


504 


93.5 




-4 


447 


490 


96.6 


-3 


445 


95.5 


-6 


451 


95.5 


-3 


444 


95.8 




8 


756 


820 


95.4 


9 


750 


94.4 


6 


768 


94.3 


5 


755 


95.4 


0.3 


1 


303 


316 


93.8 


1 


302 


92.1 


-17 


301 


91.8 


16 


302 


94.6 




2 


270 


286 


94.9 


1 


270 


94.5 





269 


94.6 


-1 


268 


95.5 




-4 


480 


497 


94.7 


-6 


481 


93.7 


-6 


479 


93.7 


-5 


481 


94.3 


0.5 


-5 


252 


254 


93.2 


4 


324 


89.8 


-19 


250 


90.9 


11 


251 


92.7 




2 


229 


234 


94.7 


-2 


254 


92.9 


1 


227 


93.7 


1 


229 


94.2 




5 


405 


404 


94.0 


-7 


473 


92.4 


4 


404 


92.8 


4 


406 


93.7 


0.7 


-3 


235 


248 


95.7 


101 


2395 


90.9 


-19 


233 


90.4 


13 


240 


95.3 




7 


221 


239 


95.6 


76 


2532 


94.1 


5 


214 


94.9 


12 


223 


95.7 




2 


384 


405 


96.0 


-192 


5077 


93.8 





384 


94.5 


9 


391 


96.5 



Three rows for each r correspond to the intercept and two slope components of estimated 
rth quantile coefficient. 

B: empirical bias (xlOOO); SD: empirical standard deviation (xlOOO); SE: empirical mean 
of the standard error (xlOOO); CI: empirical coverage (%) of 95% confidence interval. 



0.5 and Z2 uniform on [0, 1]. We considered two scenarios with the following 
conditional quantile functions: 

Q z (t) = log{- log(l - r)} + (1.25r A 0.5)Zi + 0.5Z 2 , 
Q z (t) = log{- log(l - r)} + 0.5Zl + 0.5Z 2 . 

Scenario 1 above involved a ramp-up effect of Z±, going from none to full 
linearly with probability r and staying constant afterwards. In contrast, 
scenario 2 followed the accelerated failure time model. 
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The sample size was 200. Under each scenario, simulations were conducted 
with 1000 iterations. For both scenarios, the censoring rate was approxi- 
mately 32%. Table 1 reports the summary statistics for the proposed rth 
quantile coefficient estimates ranging from r = 0.1 to 0.7, along with esti- 
mates based on the pivoting method of Portnoy (2003), the grid method of 
Portnoy [Neocleous, Vanden Branden and Portnoy (2006)], and Peng and 
Huang (2008). The two Portnoy's methods are available in R Quantreg pack- 
age, of which the latest version at the time of this research, 4.20, was used. 
The default mesh size, 0.01 in this case, was adopted for the grid method of 
Portnoy, and the same mesh size for Peng and Huang. For point estimation, 
the pivoting method of Portnoy had large bias and variance at r = 0.7 under 
both scenarios. Other than that, these estimators all had negligible bias and 
similar efficiency. But the bias of the proposed estimator seemed smaller. 
These findings are not surprising since the estimator of Peng and Huang is 
asymptotically equivalent to the proposed in the settings under study; see 
Remark 9. In addition, similar efficiency between Peng and Huang and the 
grid method of Portnoy was already observed in Peng and Huang (2008). 
For interval estimation, Peng and Huang employs the same procedure as 
the proposed, whereas the methods of Portnoy use bootstrap. The resam- 
pling size was set to 200 for all these methods. The standard error of the 
proposed estimator agreed with the standard deviation well. The Wald-type 
95% confidence intervals of both the proposed and Peng and Huang achieved 
reasonably accurate coverage probability. In contrast, the bootstrap confi- 
dence intervals of Portnoy's methods had undercoverage particularly for the 
intercept, a finding consistent with that reported in Peng and Huang (2008). 

These preceding stimulation settings conform to the conditions of the 
asymptotic study in Section 4. Additional settings with distributional dis- 
continuities and zero-density intervals of the survival time were also consid- 
ered. One such simulation involved a setting similar to the preceding ones 
but having a discontinuous baseline survival distribution: 

Qz{t) = log{- log(l -rV 0.4)} + (r V 0A)Zx + 0.5Z 2 , 

where V denotes the maximization operator. Unfortunately, the pivoting 
and grid methods of Portnoy as implemented in R Quantreg package had 
numerical difficulties and their appropriate evaluation was not permitted. 
Both the estimator of Peng and Huang and the proposed had negligible bias 
at r = 0.1 and 0.3. However, for r = 0.5 and 0.7, the absolute median-bias 
of Peng and Huang reached 0.136, 0.065 and 0.041 for the intercept and two 
slopes, respectively. In comparison, the absolute median-bias of the proposed 
estimator was no larger than 0.026 for all three coefficients. Here, median- 
bias is a more appropriate summary than mean-bias due to the skewness 
of these estimators resulting from discontinuity in the survival distribution. 
These results were expected since the validity of Peng and Huang is tied to 
the assumption of continuous survival distribution; see Remark 5. 
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5.2. Algorithmic performance. The proposed method was compared 
against the pivoting method of Portnoy, the grid method of Portnoy, and 
the Peng and Huang method implemented by Koenker (2008). All these 
existing methods are implemented with Fortran source code in R Quantreg 
package. The original implementation of Peng and Huang in R language was 
inappropriate for comparison due to the inherent slower speed of R. For the 
two grid methods, the default mesh size, 0.01 A n~°- 7 /2 for sample size n, 
was adopted. The proposed method was also implemented in R with For- 
tran source code. To minimize the impact of R overhead, calling the Fortran 
function of a method from R was timed. The computation was performed 
on a Dell 2950 computer with 3.0 GHz Pentium Xeon X5365 CPUs. 

The survival time followed the accelerated failure time model with p — 1 
nonconstant covariates 

p ( -\\m-l 

logT = £ +^i=^— ZM, 

m=2 

where e followed the extreme- value distribution, and , m = 2, . . . ,p, were 
independent and uniformly distributed on [0, 1] . The number of nonconstant 
covariates ranged from 1 to 8, and the sample size from 100 to 1600. Three 
levels of censoring, 0%, 25% and 50%, were investigated. Unless there was 
no censoring, the censoring time followed the uniform distribution between 
and a censoring rate-determined upper bound. Computational reliability and 
efficiency of various methods for point estimation of the quantile coefficient 
process were assessed with 1000 iterations, shown in Table 2. 

Both the proposed and Peng and Huang methods were reliable. However, 
the pivoting and grid methods of Portnoy tended to cause frequent R session 
crashes in case of no censoring and more covariates. Furthermore, in the 
presence of censoring, the pivoting method of Portnoy might terminate with 
warning or error messages. This rate increased with the number of covariates 
and censoring rate, up to 67%. 

The computer time of the proposed method was roughly constant over 
different censoring levels, given sample size and number of covariates. Com- 
paratively, the proposed is faster than other methods uniformly in all sce- 
narios considered. Specifically, the pivoting method of Portnoy cost 1.6 to 
6.7 times as much computer time, the grid method of Portnoy cost 1.8 to 
5.9 times, and the Peng and Huang method cost 10.5 to 47.5 times. This 
result is remarkable since the grid methods involve much fewer grid points 
than breakpoints of the proposed method at larger sample size, suggesting 
that a grid point could be much more costly to compute. 

5.3. Application to a clinical study. For illustration, we applied the pro- 
posed estimation procedure to the Mayo primary biliary cirrhosis dataset 



Table 2 

Algorithmic evaluation of timing and reliability 



Number of nonconstant covariates 



Sample size 








1 






I 






4 






8 




100 


Prop 


T 


0.6 


0.5 


0.5 


0.7 


0.6 


0.6 


1.0 


0.9 


0.9 


1.6 


1.4 


1.5 




PP 


R 


1.9 


1.9 


1.6 


1.9 


1.9 


1.7 


2.4 


2.4 


2.2 


3.6 


3.6 


3.2 






W 








4 





1 


11 





8 


24 





28 


65 




PG 


R 


3.4 


3.4 


2.4 


3.9 


3.6 


2.8 


4.9 


4.4 


3.4 


5.9 


5.4 


4.2 




PHK 


R 


47.5 


37.7 


19.7 


42.5 


34.8 


20.5 


40.4 


32.4 


19.4 


32.0 


26.6 


16.3 


200 


Prop 


T 


1.4 


1.3 


1.3 


1.7 


1.7 


1.6 


2.4 


2.2 


2.3 


4.1 


3.9 


4.0 




PP 


R 


2.3 


2.2 


1.7 


2.4 


2.3 


2.0 


3.2 


3.2 


2.9 




4.9 


4.4 






VV 





1 


4 





2 


12 





6 


26 




31 


67 




PG 


R 


3.0 


2.7 


2.0 


3.4 


2.9 


2.3 


4.5 


3.9 


2.9 




4.7 


3.6 




PHK 


R 


43.4 


31.3 


16.9 


37.1 


28.3 


16.9 


35.3 


27.8 


16.4 


27.1 


20.8 


12.7 


400 


Prop 


T 


4.5 


4.3 


4.4 


5.5 


5.5 


5.8 


7.7 


7.7 


8.0 


13.2 


13.1 


13.7 




PP 


R 


2.7 


2.4 


1.8 


2.7 


2.5 


2.1 


3.8 


3.6 


3.1 




5.6 


5.0 






W 








8 





2 


13 





6 


27 




32 


67 




PG 


R 


2.8 


2.5 


1.8 


3.2 


2.7 


2.0 


4.2 


3.6 


2.7 




4.9 


3.4 




PHK 


R 


39.0 


28.4 


15.1 


33.4 


24.5 


13.8 


31.4 


23.1 


13.9 


23.4 


17.6 


10.8 


800 


Prop 


T 


16.3 


16.4 


16.9 


20.4 


20.7 


21.4 


29.2 


29.2 


30.1 


48.1 


47.6 


49.5 




PP 


R 


2.9 


2.5 


1.9 


2.9 


2.7 


2.2 


3.9 


3.7 


3.2 




6.1 


5.5 






W 





1 


7 





2 


18 





7 


27 




30 


67 




PG 


R 


3.1 


2.6 


1.8 


3.4 


2.9 


2.1 


4.6 


3.8 


2.8 




5.3 


4.0 




PHK 


R 


39.1 


27.9 


14.8 


33.0 


24.1 


13.9 


29.5 


21.9 


13.4 


22.6 


17.1 


10.7 


1600 


Prop 


T 


63.1 


63.9 


66.0 


80.5 


79.6 


81.9 


112.6 


109.3 


113.1 


177.6 


173.3 


180.0 




PP 


R 


3.0 


2.7 


2.0 


2.9 


2.7 


2.2 


4.0 


3.8 


3.3 




6.7 


6.0 






W 





1 


7 





4 


20 





10 


28 




33 


62 




PG 


R 


3.5 


2.9 


2.0 


3.7 


3.1 


2.2 


4.9 


4.0 


3.0 




5.5 


4.3 




PHK 


R 


38.0 


27.5 


14.7 


30.2 


22.5 


13.6 


27.7 


20.9 


13.0 


21.8 


16.3 


10.5 



Prop: the proposed estimation; PP: the pivoting method of Portnoy (2003); PG: the grid method of Portnoy [Neocleous, Vanden Branden 
and Portnoy (2006)]; PHK: Peng and Huang (2008) implemented by Koenker (2008); T: CPU time (millisecond) of point estimation; R: 
timing relative to the proposed estimation; W: termination rate with warning or error (%). Timing for PP was based on iterations free 
of warning and error. Three columns under each combination of sample size and number of covariates correspond to 0%, 25% and 50% 
censoring rates. — : unavailable due to software crash. 
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[Fleming and Harrington (1991), Appendix D]. Conducted at Mayo Clinic 
between 1974 and 1984, the study followed 418 patients with primary bil- 
iary cirrhosis, a rare but fatal chronic liver disease. One question of inter- 
est was concerned with prognostic factors associated with survival. In this 
analysis, we considered five baseline measures: age, edema, log(bilirubin), 
log(albumin) and log(prothrombin time). Two participants had incomplete 
measures and were thus removed. Our analysis data consisted of 416 pa- 
tients, with a median follow-up time of 4.74 years and a censoring rate of 
61.5%. 

We adopted the quantile regression model on the logarithmic time scale, 
with the five baseline measures as covariates. The estimated quantile co- 
efficient processes are shown in Figure 1, along with pointwise Wald 95% 
confidence intervals. The resampling size for the interval estimation was 200. 
The maximum cumulative probability up to which the estimated quantile 
coefficient was unique is 0.91. Among the five covariates, log(prothrombin 
time) in particular exhibited a prominent varying effect. It was negatively 
associated with survival time for short survivors, but the effect diminished 
gradually for longer survivors. This result echos findings from analyses of this 
dataset with other varying-coefficient models, for example, by Tian, Zucker 
and Wei (2005) using the varying-coefficient Cox model. Nevertheless, this 
varying effect was not apparent in the model with log(prothrombin time) as 
the only covariate, shown in Figure 2. 

The graphical presentation is revealing of the covariate effect evolution. 
To summarize, estimated upper-trimmed mean effects and standard errors 
are given in Table 3. For comparison, the estimates based on the accelerated 
failure time model using the log-rank and Gehan estimating functions are 
also included. Notice that the two estimated regression coefficients deviate 
from each other for log(prothrombin time) with the accelerated failure time 
model. This disparity also suggests a lack of fit of this sub-model. In this 
situation, estimates from the accelerated failure time model are difficult to 
interpret. In contrast, the estimated upper-trimmed mean effects from the 
quantile regression model are meaningful, for covariates with constant or 
varying effects alike. 

6. Discussion. Quantile calculus as developed proves useful and effective 
for quantile regression. With uncensored data, it offers a new perspective of 
the standard regression procedure. Most importantly, censoring can be nat- 
urally accommodated, and it gives rise to our proposed censored regression 
via solving a well-defined estimating integral equation. To focus on the main 
ideas, we have not addressed second-stage inference and model diagnostics, 
which are practically useful. They can be developed along the lines similar 
to those in Peng and Huang (2008). 
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log(bilirubin) 



0.0 0.2 0.4 0.6 0.8 1.0 




log(pro. time) 




Cumulative probability 



Fig. 1. Analysis of the Mayo primary biliary cirrhosis data. Estimated quantile coef- 
ficient processes are shown in rugged lines, along with pointwise Wald 95% confidence 
intervals given by vertical bars. The three regions on which the estimated quantile coeffi- 
cient hyperplanes are (a) unique with uncensored S-members only, (b) unique with both 
uncensored and censored S-members, and (c) nonunique are marked on bottom horizontal 
lines. 



For survival data, alternative models exist to address varying covariate 
effects. One better known varying-coefficient model is the additive haz- 
ards model of Aalen (1980). There is also an extensive literature on the 
varying-coefficient Cox model, but most available estimation methods re- 
quire smoothing; see Tian, Zucker and Wei (2005) and the references therein. 
More recently, Peng and Huang (2007) extended the class of semiparametric 
linear transformation models to allow for varying coefficients. In compari- 
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2.2 2.4 2.6 2.8 



log(pro. time) 

Fig. 2. Analysis of the Mayo primary biliary cirrhosis data with \og(prothrombin time) 
as the only covariate. Dots and open circles represent uncensored and censored individuals, 
respectively. Estimated decile coefficients are shown from r = up to 0.8. Solid, dashed, 
and dotted lines represent the corresponding hyperplanes that are (a) unique with uncen- 
sored S-members only, (b) unique with both uncensored and censored S-members, and (c) 
nonunique, respectively. 

son with all these alternatives, the quantile regression model is particularly 
attractive with its simple interpretation; see the Introduction. 

Table 3 

Analysis results of the Mayo primary biliary cirrhosis data 



Accelerated failure time model Quantile regression model 





log-rank 


Gehan 


A*o(0, 


0.8) 


Mo(0, 


0.9) 




Est 


SE 


Est 


SE 


Est 


SE 


Est 


SE 


Age 


-0.0259 


0.0051 


-0.0255 


0.0051 


-0.0238 


0.0055 


-0.0227 


0.0056 


Edema 


-0.7627 


0.2276 


-0.9241 


0.2556 


-0.8616 


0.2413 


-0.8048 


0.2297 


log(bilirubin) 


-0.5724 


0.0519 


-0.5581 


0.0611 


-0.5504 


0.0638 


-0.5465 


0.0615 


log(albumin) 


1.6312 


0.4436 


1.4985 


0.5013 


1.4756 


0.4729 


1.4955 


0.4438 


log(pro. time) 


-1.9176 


0.5807 


-2.7761 


0.8056 


-2.1220 


0.8665 


-1.9426 


0.8190 



Est: estimate; SE: standard error. 
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When the inferential goal is on a specific quantile, the quantile regression 
model for the given r only, or the singleton model, is of direct interest. In 
this case, methods for uncensored quantile regression [Koenker and Bassett 
(1978)] and censored one with always-observed censoring time [Powell (1984, 
1986)] are still applicable. But when censoring time is only observed for cen- 
sored individuals, the proposed method as well as Portnoy (2003), Neocleous, 
Vanden Branden and Portnoy (2006) and Peng and Huang (2008) may not 
be applied unless the quantile regression model holds from the 0th through 
the rth quantile. In contrast, the approaches of Ying, Jung and Wei (1995) 
and Honore, Khan and Powell (2002) do not necessarily require the func- 
tional model but at the price of a more restrictive censoring mechanism. A 
choice between these two classes of methods depends on whether functional 
survival-time modeling or censoring-time modeling might be more reason- 
able and justifiable in a specific application. The method of Wang and Wang 
(2009) is appealing in this regard, but might have practicality concerns in 
the case of multiple continuously-distributed covariates, as discussed in the 
Introduction. 

Generalizing the asymptotic results given in Section 4 is of interest, say, 
to allow for zero-density intervals and discontinuities in the survival dis- 
tribution. Unfortunately, difficulties include the open question on solution 
uniqueness for integral equation (8), as indicated in Remark 2, and more. 
These additional ones can be readily seen in the one-sample case. First, the 
notion of consistency might not even be appropriate in evaluating the esti- 
mated quantile corresponding to a zero-density interval. Indeed, consistent 
estimation might be impossible in this circumstance but the estimated quan- 
tile is nonetheless informative of the estimand. Second, a distributional dis- 
continuity might ruin asymptotic normality of the corresponding estimated 
quantile. These issues become much more complex and also have broader im- 
plication in the general case. Due to the sequential nature of the estimation, 
of concern are not only those corresponding quantile coefficients but also 
the subsequent ones. Nevertheless, it seems reasonable to conjecture that 
consistency and asymptotic normality might still hold for estimated quan- 
tile coefficients other than those corresponding to zero-density intervals and 
distributional discontinuities. 

Several additional problems are also worth further investigation. First, our 
focus has been on the estimation of /3o(t) provided r < f , and the estimation 
of identifiability limit r remains to be addressed; see Remark 7. Second, 
the submodel with a mixture of constant- and varying-coefficients would be 
useful when constant effects are determined a priori for some covariates. 
Efficiency gain might be expected over the more general method in this 
article. Third, in addition to right censoring, quantile regression with other 
types of censoring and truncation is also of interest. But new techniques 
might be needed. Finally, the proposed 0th quantile coefficient estimator 
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as given by (17) might be of interest in its own right. In the absence of 
censoring, our estimator reduces to the extreme regression quantile studied 
by Smith (1994), Portnoy and Jureckova (1999) and Chernozhukov (2005) 
among others. Some of their results may be extended. 

APPENDIX A: PROOF OF PROPOSITION 1 

Consider assertion (i). Given that rj(r) = £(r) for all r such that Pr{T = 
Q(t)} > 0, equation (4) still holds when £(■) is replaced by rj(-). Therefore, 



/ T [Pr{C > Q{v)} - Pr{C = Q(u)}r,(u)] 
Jo 

x d[Pr{T < Q{y)} + Pr{T = Q(v)}r)(v)} 

= [ T [Vt{C>Q{u)}-Vt{C = Q{u)} V (u)} 
Jo 



x [Pr{T > Q(u)} - Pr{T = Q(v)} v (v)}^- Vr € [0, 1). 

The above equation simplifies to (5) under the given conditions. 

For assertion (ii), only the case of r > needs to be considered. The 
definition of r implies E(A[I{X > Q(r)} - I{X = Q(r)}f (r)]) = 0. Thus, 

(23) E(A[I{X < Q(r)} + I{X = Q(r)K(r)]) = EA. 

Define r* = sup{r:Pr{C > q{v)} > \/v G [0, r]}. The same argument as 
before gives 

(24) E(A[I{X < q(r*)} + I{X = q(r*)}r,(r*)]) = EA. 

Given the continuity of the left-hand side of (5) in r, the above equation 
implies r* > 0. Since Pr{C > q(r)} > for any r £ [0,r*), (5) under the 
given conditions implies 

Pr{T < q(r)} + Pr{T = g(r)}r/(r) 

= [l-Pr{T <</(!/)} -Pr{T = ( Z (i/)}» 7 (i/)]- Vre[0,r*). 

Jo i—v 

The above integral equation has a unique solution: 

Pr{T < ? (r)} + Pr{T = 9(r)}r/(r) = r Vr G [0, r*), 

from which (6) and (7) follow for r € (0,r*). Furthermore, (23) and (24) 
imply r* = r . This completes the proof. 
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APPENDIX B: PROOF OF PROPOSITION 2 

Existence result (i) follows directly from Proposition 1. We now prove 
uniqueness result (ii) by construction. Start from r = 0. Write EI as the 
discrete distributional support of (C,Z), and define 

n = supjV : I{c > z T /?(^i)} = lim I{c > z T f3(u 2 )} and c ^ z T ' P(v x ) 

Wi e (0,r] and (c,z) 6l}. 

Thus, I{C > Z T /3(r)} remains constant and C ^ Z T /3(t) over r G (0,Ti) 
almost surely. Locally, (8) reduces to 

D{r)= f T { Y (r)-D(u)}-^, r€[0,n), 
JO i — i/ 

where D(r) is the left-hand side of (8) and Y(t) = E[ZI{C > Z t P(t)}} 
is constant over r E (0, ri). The above equation admits a unique solution 
Z>(r) =rF(r), or equivalently 

E(Z/{C7 > Z t P(t)}[I{T < Z t /3(t)} + J{T = Z T /3(r)}r/ z (r) - r]) = 0, 

re(0,n). 

By arguments similar to Remark 3, /3(r) is the minimizer of E[{X - Z T @{t) A 
C} _ + t{X — Z T (3(t) A C}]. Recognizing that this minimization problem is 
the basis for Powell's (1984, 1986) estimator, we then obtain (9) for r G (0, Ti) 
so long as f > 0. Given (9), integral equation (8) implies 

J(r) = - ? J[y)^- re [0,n), 

where J(r) is the difference between the two sides of (10). Thus, (10) is 
obtained for r £ (0,7i) by an application of the Gronwall's inequality. 
Under Assumption 1, one can show 

lim E(Z[I{X > Z t P(t)} - I{X = Z T P(t)}i 1z {t)]) 

= (1 - ti) lim E[ZI{C > Z t /3(t)}}. 
Tin 

Then, by taking advantage of the notion of relative quantile and integral 
equation (12), results (9) and (10) can be established inductively beyond ri, 
up to r. 
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APPENDIX C: PROOF OF LEMMA 1 AND THEOREM 1 

With the developments in Section 3, it remains to establish Lemma 1. 
Given the existence of a feasible value for /3(0, r), nonnegativity of the ob- 
jective function in (16) ensures the existence of a minimizer. Furthermore, 
note that the objective function becomes linear upon adding Xi < Zjb or 
Xi > Zjb for each censored individual to the constraints. Therefore, this 
piecewise-linear programming problem becomes the minimization of a set 
of linear programming problems, where each member involves additional 
constraints concerning censored individuals. It is known that a linear pro- 
gramming problem has a vertex solution if a bounded solution exists [e.g., 
Gill, Murray and Wright (1991), Section 7.8.2]. Assertion (iii) of Lemma 1 
then follows. 

For assertion (iv), we only consider the situation that the interpolated 
set is of size p; otherwise one may work with the corresponding perturbed 
problem. Write E>c as the subset of censored individuals in S. The following 
two linear programming problems have the same solution as (16): 



subject to 



min —A T b 
b 



X t <Zjb Vi G D_, Xi = Zjb Vi G Bo, 
Xi>Zjb Vi G D+, Xi<Zjb ViGSc, 

min -[ A+ ^2 Zi) b 



b 



subject to 

Xi < Zjb Mi G D_, Xi = Z> b Vi G i 

•T 
3 



Xi>Zjb ViGD+, Xi>Zjb ViGSc 



where A = £ i6Do {l - ^(r)}Z; + £ ieD+ Z t + £ i: Ai=0)Xi>z t^ t) Z t . Of 
course, the above two coincide in the case of Sc* = 0. Applying Gill, Murray 
and Wright [(1991), Theorem 7.8.1] yields 

for some 7> , where 7> ' < for i G D_, 7} > for i G D+, and 7^ < 

and 7 | 2 ^ > for z G §c- Since Z§ is of full rank, 7^ = 7j - 2 ^ for i G S \ Sc 

and 7J- 1 "* = 7 t - 2 ' ) — 1 for i G §0 Therefore, -ff as determined upon setting 

Wi (r) = ) G [0, 1] for i G S c satisfies (18), with 7i = 7 [ } for i G S\Sc- This 
completes the proof. 
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APPENDIX D: PROOF OF THEOREM 2 



Similar to Peng and Huang (2008), we introduce monotone maps Ti(b) = 
E{ZAI(X < Z T b)} and r 2 (6) = E{ZI(X > Z T b)}. Write their empirical 

counterparts as Ti(6) = n~ l Y17=i ZiA{I(Xi < Zjb) and T2Q)) = n~ l Ym=\ x 
l(Xi > Zjb). Under condition C3, T x (b) is a one-to-one map for b € B(u) and 
Y^\-) exists. Write H(a) = T 2 {T^ 1 (a)} and note dH(a)/da T = -^{T^ia)}- 
n, where is defined in condition C4 and n is the p x p identity matrix. 

Identifiability. Write a(r) = Ti{P(t)}, and integral equation (8) can be 
written as 

dv 



a(r) 



H{a(v)} 



o 



1 



note that terms involving rjz(-) vanish under condition C2. Condition C4 im- 
plies that H (a) is Lipschitz-continuous. The Picard-Lindelof theorem then 
asserts the solution uniqueness, that is, a(-) = ao(-) = Pi{/3o(t)}. It follows 
that /3(r) = /3 (r) for all t e (0,u\. 

Consistency. It is known that {I(X < Z T b):b € MP} is Donsker [e.g., 
Kosorok (2008), Lemma 9.12]. Furthermore, Z is bounded under condition 
CI. By permanence property of the Donsker class, {ZAI(X < Z T b):b € 
M p } is Donsker. So is {ZI(X > Z T b) : b E M p } by similar arguments. Since 
Donsker implies Glivenko-Cantelli, almost surely 



(25) 



su P ||r i (6)-r i (6)|| = (i) 



J 



1,2. 



On the other hand, condition C2 implies that sup bgKP Y17=i H^i = Zjb) < 
p almost surely. Then, coupled with condition CI, with any Wi E [0,1], al- 
most surely 



sup 



n 



i=l 



sup 

bsRP 

Therefore, almost surely 



1 J2z i A i I(X i = Zjb)(w i -l 

n 



(26) 



sup 

t£[0,u] 



TMr)}- / r 2 {/3H} 



dv 



l-v 



--Oin' 1 ), 
--Oin- 1 ). 

0(n- 1 ), 



since (13) can be written as 

n 

f !{/3(r)} + n- 1 ZiAiHXt = Zj P(t)}{ Wi {t) - 1} 



i=l 
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T 2 {(3(v)} - n~ x J2 Z i T iXi = zJ/3(v)} Wi (i 



du 



and /?(•) is a solution. 

Following (25) and (26), almost surely 



sup 

re[0,it] 



a(r) 



H{a(u)} 



du 



1 



o(l), 



where 2(r) = Ti{/3(t)}. Write L as the Lipschitz constant of H(-). Thus, 
for every e > and with sufficiently large n, almost surely 



|a(r) -a (r)|| < 



< 



\\H{a(u)}-H{a (u)}\ 
du 



L\\a(u) - a (u)\\ 



1 



1 - 



+ e 



which leads to 
(27) 



\\a(T)-a {T)\\<e(l-T)- Jj , r€ [0,u], 

by the Gronwall's inequality. Therefore, a(r) is strongly consistent for ao( r ) 
uniformly in r € [0,u]. 

It remains to show that, for any e > 0, there exists S > such that 
su Pre^n]ll«( r ) - «o(r)|| < 6 implies sup rgM \\/3(r) - Po(t)\\ < e. Suppose 
that the assertion is false. Thus, for each 5 > 0, there exists (6, u) such that 
||ri(i>) — ao(^)|| < 5 and \\b — (3o(u)\\ > d for some constant d > 0. Then, there 
is a subsequence of (b,u) that converges to, say, (bo,vo). This means that 
ri(fro) = Ti{/5o( l/ o)} but &o 7^ A)(^o)- However, conditions CI and C3 im- 
ply that fz{Z T Pq(t)} is bounded below away from uniformly in r G [l,u] 
and Z. Therefore, dT 1 {b)/db T = E{Z® 2 G z {Z T b)f z (Z T b)} at b = P (v ) is 
positive definite, which along with the monotonicity of Ti(-) gives rise to a 
contradiction. 

Weak convergence. 

Lemma 2. Under the conditions in Theorem 2, 

sup ||f !{^(r)} - vSiT)} - ri{/5 (r)} + ^{^(r)}!! 

re[0,w] 



(28) 



(29) 



o p (n 



-l/2> 



sup 

re[0,u] 



[r 2 {/3(^)} - r 2 {/3(^)} - r 2 {/3 (z,)} + r 2 {/? (^)}] 



o p (n 



-1/2, 
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Proof of Lemma 2. Consider (28) first. Since {ZAI(X < Z T b):be 
RP} is Donsker, nV 2 {f 1(6) - Tx(b)} converges weakly to a tight Gaussian 
process. The tightness implies that, for every e > and m = 1, . . . ,p, 

lim lim sup Pr ( sup n 1 ^ 2 \T^ n \bi) 

610 rwoc \, &2:CT [ni/2{f( m) (b 1 )-f( m) (fc 2 )}]<5 

-rS m) (6!)-fl m) (6 2 ) 

+ T[ m \b 2 )\>e 

where a(-) denotes standard deviation and superscript ( m ) the mth compo- 
nent of a vector; see, for example, Kosorok (2008). Furthermore, note that 

CT 2 [n i/2 { rM (&i) _ f M (&2)}] 

= a 2 [Z^ m) A{I(X < Z T b l ) - I(X < Z T b 2 )}} 
< E{Z^ 2 A\I(X < Z T h) - I(X < Z T b 2 )\}. 

Write T(/3 , far) = E[A\I(X < Z T b)-I{X < Z T /3o(r)}|]| fe= ^ {r) . Given con- 
dition CI, it then suffices to show 

(30) sup T(/3 ,At) = O (1) 

t£[0,u] 

almost surely. Let c/ be the upper bound of fz(-)- Apparently, 
T(/3 ,M < c f E[\Z T {b - f3 (r)}\]\ b= ^ T) 
<c0(r)-Mr)\\E\\Z\\. 
Following the consistency of /3(-), for every / > 0, 

(31) sup T(/3 ,M = O (l) 

re[l,u] 

almost surely. On the other hand, 

T(/3 J,T)<rS 1) {^(T)} + rS 1) {/3o(r)} 

< 2rS 1) {/3 (r)} + [if ] 0(r)} ~ {Po(r)}\. 

Therefore, following the consistency of a(-), for every e > 0, there exists 
r e > such that 

(32) sup T(/3 J,T)<e 

re[0,r e ] 

almost surely for sufficiently large n. Combining (31) and (32) gives (30). 
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Now consider (29). Arguments similar to the above establish that, for 
every I > 0, 



(33) sup ||r 2 {/3(r)} - r 2 {/3(r)} - r 2 {/3 (r)} + r 2 {/3 (r)}|| = o p (n 



-1/2^ 



rel,u 



Since sup 6eRP ||T 2 (6) — r 2 (6)|| = O p (n 1 ^ 2 ), for every e > 0, there exists r e > 
such that 



(34) 



Pr( sup 

^rG[0,r e ] 



n l l 2 \f 2 0{v)}-T 2 {P(u)} 



r 2 {A)W} + r 2 {A)H}] 



> e ->• 0. 



Then, (29) follows from (33) and (34). □ 



Plugging (28) and (29) into (26) yields that, for r £ [0,u] 



a 



du 



(r)-ao(r)- / [H{a(u)} - H{a (u)}]- + 0p (n~^) 

.In I — U 

du 



-ri{A)(r)}+ / r 2 {/3 (^)} 



1-z/ 



where o p (-) is uniform for r £ [0, u). Under conditions C2 and C4, i?{S(r)} — 
i?{ao(r)} = -[^{PoiT)} + 11 + o(l)][a(r) - a (r)] almost surely. Therefore, 

a(r) - a (r) + + n][a(i/) - a W]t^- 



(35) 



+ 0p (n- 1 /2 + || a( . ) _ ao( . ) || ) 
= -f 1 {/3 (r)}+ f f 2 {A)(*/)} 



du 
1-u' 



The remaining proof is sketched since it essentially follows that of The- 
orem 2 in Peng and Huang (2008), where more details can be found. Note 
that the right-hand side of (35) is a martingale and converges weakly to 
a Gaussian process by the martingale central limit theorem [e.g., Flem- 
ing and Harrington (1991)]. Furthermore, (35) as a differential equation of 
S(-) — ao(-) can be solved by using product integration theory [Gill and 
Johansen (1990)], establishing that a(-) — «o( - ) as a linear map of the right- 
hand side converges weakly to a Gaussian process. The weak convergence of 
/?(•) then follows by the functional delta method. 
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APPENDIX E: PROOF OF THEOREM 3 

Throughout, a quantity based on the perturbed sample is denoted by 
adding an asterisk. For example, is the perturbed version of T\(b). 

The same arguments of the consistency proof in Appendix D may be used 
to show the strong consistency of for ao(-) on [0, u] and that of /?*(•) 
for Po(-) on [l,u], upon establishment of the following two results. First, by 
Kosorok [(2008), Theorem 10.13], almost surely 

sup||f*(&)-r,-(&)||=o(l), j = 1,2. 



Second, the terms involving Wi(-) in the perturbed version of (13) are neg- 
ligible, by the fact that the maximum of the n i.i.d. perturbing random 
variables is almost surely o(ra 1//2 ) [Owen (1990), Lemma 3]. 

By an unconditional multiplier central limit theorem [Kosorok (2008), 
Theorem 10.1 and Corollary 10.3], n 1 / 2 ^-) - Tj(-)}, j = 1,2, converge 
weakly to tight processes. The arguments in the proof of Lemma 2 then can 
be used to establish 

sup \\ft0*(r)} - ri{^(r)} - r;{/3b(r)} + r 1 {A)(r)}|| 

re[0,u] 

dv 



sup 

r€[0,u] 



[T* 2 {p*(v)} - r 2 {p*(v)} - r* 2 {Po(v)} + r 2 {A,H}] 



1-v 



Thus, along the lines to establish (35), one obtains 

a*(r) - oo(r) + fmPoiy)} + n][a» - «oM]^- 

io i — v 

(36) +0t ,( 7l -V2 + ||S*(.)_ ao (.)||) 



-fi{A)(r)}+ fnmv)} 

Jo 



dv 



given that a perturbing random variable is almost surely o(n 1 / 2 ). 
Following from (35) and (36), 

S*(r) - S(r) + f [*{#)(")} + n][S» - a(u)]-^- 
Jo 1 - v 

(37) + 0p („-V2 + ||3*(.)_ S (.)||) 

= -n{/3 (r)} + f 1 {/3 (r)}+ f [f^oH} - ? 2 {(3 (v)}] ' 



1-v' 
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since \\a(-) - a (-)\\ = Opin" 1 / 2 ). Note that both AI{X < Z T (3 (r)} and 
Jo I{X > Z t /3 (v)}{1 - vY l dv are monotone in r. Therefore, {AI{X < 
Z T x /3 (t)} - j T I{X > Z T (3 Q {v)}{l -v)- l dv.T£ [0,it]} is Donsker and so 
is {Z[AI{X < Z t (3 (t)}- £ I{X > Z T [3 (v)}(l - v)" 1 du]:re [0, u]}. By a 
conditional multiplier central limit theorem [Kosorok (2008), Theorem 10.4], 
the right-hand side of (37) conditionally on the data converges weakly to the 
same Gaussian process as the right-hand side of (35). Then, the assertion of 
Theorem 3 follows the arguments at the end of Appendix D. 
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